Locality Optimizations for Parallel Computing Using Data Access Information
Abstract
Given the large communication overheads characteristic of modern parallel machines, optimizations that improve locality by executing tasks close to data that they will access may improve the performance of parallel computations. This paper describes our experience automatically applying locality optimizations in the context of Jade, a portable, implicitly parallel programming language designed for exploiting task-level concurrency. Jade programmers start with a program written in a standard serial, imperative language, then use Jade constructs to declare how parts of the program access data. The Jade implementation uses this data access information to automatically extract the concurrency and apply locality optimizations. We present performance results for several Jade applications running on the Stanford DASH machine. We use these results to characterize the overall performance impact of the locality optimizations. In our application set the locality optimization level has little effect on the performance of two of the applications and a large effect on the performance of the rest of the applications. We also found that, if the locality optimization level had a significant effect on the performance, the maximum performance was obtained when the programmer explicitly placed tasks on processors rather than relying on the scheduling algorithm inside the Jade implementation.
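To make the programming model concrete, the sketch below shows, in C-like pseudocode, how a Jade-style task might declare the data it will access. The withonly { ... } do (...) construct and the rd()/wr() declarations follow the general style described in the Jade literature, but the exact syntax and all identifiers here (block_t, relax_block, relax_all) are illustrative assumptions, not code from the paper.

    /* Illustrative sketch only: Jade-style access declarations in C-like
       pseudocode.  block_t, relax_block(), and relax_all() are hypothetical
       names invented for this example. */
    typedef struct { double cells[1024]; } block_t;

    void relax_block(block_t *b);   /* unchanged serial computation on one block */

    void relax_all(block_t *blocks, int nblocks)
    {
        for (int i = 0; i < nblocks; i++) {
            block_t *b = &blocks[i];
            /* Declare up front that this task will read and write only block b.
               The Jade implementation uses such declarations to extract
               concurrency (tasks touching disjoint blocks may run in parallel)
               and to schedule each task close to the data it accesses. */
            withonly { rd(b); wr(b); } do (b) {
                relax_block(b);   /* serial task body from the original program */
            }
        }
    }

As the abstract notes, when the locality optimization level mattered for an application, the best performance in this study was obtained when the programmer explicitly placed such tasks on processors rather than relying on the locality-based scheduling inside the Jade implementation.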
Similar Resources
Communication Optimizations for Parallel Computing Using Data Access Information
Given the large communication overheads characteristic of modern parallel machines, optimizations that eliminate, hide or parallelize communication may improve the performance of parallel computations. This paper describes our experience automatically applying communication optimizations in the context of Jade, a portable, implicitly parallel programming language designed for exploiting task-level concurrency...
Cacheminer: A Runtime Approach to Exploit Cache Locality on SMP
Exploiting cache locality of parallel programs at runtime is a complementary approach to a compiler optimization. This is particularly important for those applications with dynamic memory access patterns. We propose a memory-layout oriented technique to exploit cache locality of parallel loops at runtime on Symmetric Multiprocessor (SMP) systems. Guided by application-dependent and targeted ar...
Hierarchical Domain Partitioning For Hierarchical Architectures
The history of parallel computing shows that good performance is heavily dependent on data locality. Prior knowledge of data access patterns allows for optimizations that reduce data movement, achieving lower data access latencies. Compilers and runtime systems, however, have difficulties in speculating on locality issues among threads. Future multicore architectures are likely to present a hie...
On the Cache Access Behavior of OpenMP Applications
The widening gap between memory and processor speed results in increasing requirements to improve cache utilization. This issue is especially critical for OpenMP execution, which usually exploits fine-grained parallelism. The work presented in this paper studies the cache behavior of OpenMP applications in order to detect potential optimizations with respect to cache locality. This study is base...
Parallel Data Mining for Association Rules on Shared-Memory Multiprocessors
In this paper we present a new parallel algorithm for data mining of association rules on shared-memory multiprocessors. We study the degree of parallelism, synchronization, and data locality issues, and present optimizations for fast frequency computation. Experiments show that a significant improvement of performance is achieved using our proposed optimizations. We also achieved good speed-up...
Journal: International Journal of High Speed Computing
Volume: 9, Issue: -
Pages: -
Published: 1997